A Computational Approach to the Discovery and Representation of Lexical Chunks
Abstract
Lexical chunks have in recent years become widely recognized as a crucial aspect of second language competence. We address two major challenges that chunks pose for lexicography and describe computational approaches to each. The first is lexical knowledge discovery: the need to uncover which strings of words constitute chunks worthy of learners’ attention. The second is representation: how such knowledge can be made accessible to learners. To address the first challenge, we propose a greedy algorithm, run on 20 million words of the BNC, that iteratively applies word association measures to increasingly long n-grams. This approach prioritizes high recall and then attempts to isolate false positives through sorting mechanisms. To address the challenge of representation, we propose embedding the algorithm in a browser-based agent as an extension of our current browser-based collocation detection tool.
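The greedy procedure described in the abstract can be illustrated with a minimal sketch. The abstract does not name the association measure, the thresholds, or the sorting criteria, so everything specific here is an assumption: pointwise mutual information (PMI) stands in for the association measure, `min_count` and `threshold` are hypothetical parameters, and the function names are invented for illustration. The sketch admits an n-gram only if it extends an already admitted shorter n-gram, then sorts the candidates by score so that weaker ones sink to the bottom.

```python
import math
from collections import Counter


def ngram_counts(tokens, max_n):
    """Count every n-gram of length 1..max_n in the token list."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def pmi(counts, total, left, right):
    """PMI between an n-gram and the single word that follows it.

    Assumption: probabilities are approximated as count / total tokens
    for n-grams of every length.
    """
    joint = counts[left + right]
    if joint == 0:
        return float("-inf")
    return math.log2(joint * total / (counts[left] * counts[right]))


def grow_chunks(tokens, max_n=4, min_count=2, threshold=1.0):
    """Greedily extend n-grams one word at a time (hypothetical sketch).

    An (n+1)-gram is kept only if its n-word prefix was kept in the
    previous round, it occurs at least min_count times, and the PMI
    between the prefix and the new word reaches the threshold.
    """
    counts = ngram_counts(tokens, max_n)
    total = len(tokens)
    accepted = {}  # chunk -> score of the step that admitted it
    frontier = {gram for gram in counts if len(gram) == 1}
    for n in range(2, max_n + 1):
        next_frontier = set()
        for gram, count in counts.items():
            if len(gram) != n or count < min_count:
                continue
            if gram[:-1] not in frontier:
                continue
            score = pmi(counts, total, gram[:-1], gram[-1:])
            if score >= threshold:
                accepted[gram] = score
                next_frontier.add(gram)
        frontier = next_frontier
    # Sort by descending score so low-scoring candidates are easy to review.
    return sorted(accepted.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    text = ("as a matter of fact he left . "
            "as a matter of fact she stayed . a dog saw a cat")
    for chunk, score in grow_chunks(text.split(), max_n=5):
        print(" ".join(chunk), round(score, 2))
```

A low `threshold` and `min_count` favor recall, in the spirit of the approach described above; the final sort is only one possible way to push likely false positives down the list.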
Related resources
Key Lexical Chunks in Applied Linguistics Article Abstracts
In any discourse domain, certain chunks are particularly frequent and deserve attention by the novice to be initiated and by the expert to maintain a sense of community. To make a relevant contribution to the awareness about applied linguistics texts and discourse, this study attempted to develop lists of lexical chunks frequently used in the abstracts of applied linguistics journals. The abstr...
Appropriation-Based Syllabus and Advanced EFL Learners’ Speaking Skill: The Case of Chunks-on-Card Activities
The impetus for conducting the present study came from Thornbury's (2005) approach to teaching speaking, in which he claimed that awareness-raising techniques, along with appropriation strategies, facilitate the process of teaching and learning speaking. Therefore, the present study attempted to explore the impact of the appropriation-based syllabus to teach speaking by using chunks-on-card...
The Effect of Extensive Reading on Iranian EFL Learners’ Lexical Bundle Performance: a comparative study of adaptive and authentic texts
Formulaic language and sequence, as the core characteristic of real-life language and native-like fluency, have been a subject of inquiry in recent decades. The aim of the present study is to investigate the effects of two extensive reading text types, i.e., adaptive and authentic, on Iranian EFL learners’ development of lexical bundles. To this aim, 20 intermediate EFL learners were chosen to pa...
Mental Representation of Cognates/Noncognates in Persian-Speaking EFL Learners
The purpose of this study was to investigate the mental representation of cognate and noncognate translation pairs in languages with different scripts to test the prediction of the dual lexicon model (Gollan, Forster, & Frost, 1997). Two groups of Persian-speaking English language learners were tested on cognate and noncognate translation pairs in Persian-English and English-Persian directions with...